I'm working on a MARL project. The observation is a (31,1) vector that I first process with a few fully connected layers, whose output then feeds a recurrent policy. For some reason, after a few million steps of training, the output of the first FC layer becomes a matrix of NaNs. I checked, and there are no NaNs in the observation itself. Example of the observation from the last crash:
```
tensor([[ 2.8740e-02,  2.2078e-02,  1.9542e-02,  ..., -3.3949e-01,
          6.2327e-02, -2.8951e-04],
        [ 4.0109e-02,  2.2649e-02,  2.0599e-02,  ..., -3.3947e-01,
          5.5702e-02, -5.4328e-05],
        [ 5.1799e-02,  2.3269e-02,  2.1813e-02,  ..., -3.4162e-01,
          5.3501e-02, -8.1255e-04],
        ...,
        [ 1.7621e-01,  2.1108e-03,  1.4367e-02,  ..., -3.4072e-01,
          4.2021e-02, -1.3159e-02],
        [ 1.7600e-01, -2.2701e-05,  1.2215e-02,  ..., -3.4045e-01,
          4.2869e-02, -1.3915e-02],
        [ 1.7618e-01,  4.4542e-04,  1.2899e-02,  ..., -3.4266e-01,
          4.4017e-02, -1.8093e-02]], device='cuda:0')
```
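Since the observation itself is clean, one sanity check is whether the NaNs already live in the layer's weights before the forward pass (e.g. from a diverged gradient update). A minimal sketch, where `fc` and `obs` are stand-ins for the actual first FC layer and observation batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
fc = nn.Linear(31, 64)    # stand-in for the first FC layer
obs = torch.randn(6, 31)  # stand-in for the observation batch

def report_nonfinite(module):
    """Return the names of parameters containing NaN or Inf."""
    return [name for name, p in module.named_parameters()
            if not torch.isfinite(p).all()]

bad = report_nonfinite(fc)          # empty list if the weights are healthy
out = fc(obs)
print("non-finite params:", bad)
print("output finite:", torch.isfinite(out).all().item())
```

If this reports non-finite parameters right after an optimizer step, the problem is in the gradients/update, not in the forward pass.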
I've tried a few things that did not work: using LeakyReLU instead of ReLU, and removing layer normalization.
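To narrow down where it first goes wrong, one can also register forward hooks that raise at the first module emitting a non-finite output, instead of crashing millions of steps later. A sketch, with `model` as a hypothetical stand-in for the FC + recurrent stack (`torch.autograd.set_detect_anomaly(True)` does the analogous check for the backward pass):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical stand-in for the FC layers feeding the recurrent policy.
model = nn.Sequential(nn.Linear(31, 64), nn.ReLU(), nn.Linear(64, 32))

def make_hook(name):
    def hook(module, inputs, output):
        # Raise as soon as any layer produces NaN/Inf, naming the culprit.
        if not torch.isfinite(output).all():
            raise RuntimeError(f"non-finite output in {name}")
    return hook

for name, module in model.named_modules():
    if len(list(module.children())) == 0:  # hook leaf modules only
        module.register_forward_hook(make_hook(name))

out = model(torch.randn(4, 31))  # raises immediately if any layer emits NaN
```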
Do you have any tips?
TL;DR Any ideas on why a fully connected layer that processes the observation outputs NaNs after a few million steps?
submitted by /u/No_Possibility_7588